ai community
Rigor in AI: Doing Rigorous AI Work Requires a Broader, Responsible AI-Informed Conception of Rigor
In AI research and practice, rigor remains largely understood in terms of methodological rigor---such as whether mathematical, statistical, or computational methods are correctly applied. We argue that this narrow conception of rigor has contributed to the concerns raised by the responsible AI community, including overblown claims about the capabilities of AI systems. Our position is that a broader conception of what rigorous AI research and practice should entail is needed. We believe such a conception---in addition to a more expansive understanding of 1) methodological rigor---should include aspects related to 2) what background knowledge informs what to work on (epistemic rigor); 3) how disciplinary, community, or personal norms, standards, or beliefs influence the work (normative rigor); 4) how clearly articulated the theoretical constructs under use are (conceptual rigor); 5) what is reported and how (reporting rigor); and 6) how well-supported the inferences from existing evidence are (interpretative rigor). In doing so, we also provide useful language and a framework for much needed dialogue about the AI community's work by researchers, policymakers, journalists, and other stakeholders.
Sony AI table tennis robot outplays elite human players
In an article published today in Nature, Sony AI introduce Ace, the first robot to beat elite human players in competitive physical sport. Although AI systems have shown advanced performance in digital domains and board games (such as complex video games, chess and Go), translating this to physical performance has remained a significant challenge. Such a feat requires perception, planning, and control to work in a high-speed domain on the scale of milliseconds. Table tennis is a demanding and complex real-world test for robotics, requiring rapid decision-making, precise physical execution, and continuous adaptation to an unpredictable opponent. The ball's high speed, spin, and complex trajectories are central to competitive play.
Congratulations to the #AAAI2026 award winners
A number of prestigious AAAI awards were presented during the official opening ceremony of the Fortieth AAAI Conference on Artificial Intelligence (AAAI 2026) in Singapore, on Thursday 22 January. The AAAI Award for Artificial Intelligence for Humanity recognises the positive impacts of artificial intelligence to protect, enhance, and improve human life in meaningful ways with long-lived effects. The winner of this year's award is Shakir Mohamed. The Robert S. Engelmore Memorial Award recognises outstanding contributions to automated planning, machine learning and robotics, their application to real-world problems and extensive service to the AI community. The annual AAAI/EAAI Outstanding Educator award was created to honour a person (or group of people) who has made major contributions to AI education that provide long-lasting benefits to the AI community and society as a whole.
Position: We Need Responsible, Application-Driven (RAD) AI Research
Hartman, Sarah, Ong, Cheng Soon, Powles, Julia, Kuhnert, Petra
This position paper argues that achieving meaningful scientific and societal advances with artificial intelligence (AI) requires a responsible, application-driven approach (RAD) to AI research. As AI is increasingly integrated into society, AI researchers must engage with the specific contexts where AI is being applied. This includes being responsive to ethical and legal considerations, technical and societal constraints, and public discourse. We present the case for RAD-AI to drive research through a three-staged approach: (1) building transdisciplinary teams and people-centred studies; (2) addressing context-specific methods, ethical commitments, assumptions, and metrics; and (3) testing and sustaining efficacy through staged testbeds and a community of practice. We present a vision for the future of application-driven AI research to unlock new value through technically feasible methods that are adaptive to the contextual needs and values of the communities they ultimately serve.
What will the AI revolution mean for the global south?
I come from Trinidad and Tobago. As a country that was once colonized by the British, I am wary of the ways that inequalities between the global north and global south risk being perpetuated in the digital age. When we consider the lack of inclusion of the global south in discussions about artificial intelligence (AI), I think about how this translates to an eventual lack of economic leverage and geopolitical engagement in this technology that has captivated academics within the industrialised country in which I reside, the United States. As a scientist, I experienced an early rite of passage into the world of Silicon Valley, the land of techno-utopianism, and the promise of AI as a net positive for all. But, as an academic attending my first academic AI conference in 2019, I began to notice inconsistencies in the audience to whom the promise of AI was directed.
WelQrate: Defining the Gold Standard in Small Molecule Drug Discovery Benchmarking
While deep learning has revolutionized computer-aided drug discovery, the AI community has predominantly focused on model innovation and placed less emphasis on establishing best benchmarking practices. We posit that without a sound model evaluation framework, the AI community's efforts cannot reach their full potential, thereby slowing the progress and transfer of innovation into real-world drug discovery. Thus, in this paper, we seek to establish a new gold standard for small molecule drug discovery benchmarking, WelQrate. Specifically, our contributions are threefold: (1) WelQrate dataset collection: we introduce a meticulously curated collection of 9 datasets spanning 5 therapeutic target classes. Our hierarchical curation pipelines, designed by drug discovery experts, go beyond the primary high-throughput screen by leveraging additional confirmatory and counter screens along with rigorous domain-driven preprocessing, such as Pan-Assay Interference Compounds (PAINS) filtering, to ensure high-quality data in the datasets; (2) WelQrate Evaluation Framework: we propose a standardized model evaluation framework considering high-quality datasets, featurization, 3D conformation generation, evaluation metrics, and data splits, which provides reliable benchmarking for drug discovery experts conducting real-world virtual screening; (3) Benchmarking: we evaluate model performance through various research questions using the WelQrate dataset collection, exploring the effects of different models, dataset quality, featurization methods, and data splitting strategies on the results. In summary, we recommend adopting our proposed WelQrate as the gold standard in small molecule drug discovery benchmarking. The WelQrate dataset collection, along with the curation code and experimental scripts, is publicly available at www.WelQrate.org.
AI for Just Work: Constructing Diverse Imaginations of AI beyond "Replacing Humans"
Jin, Weina, Vincent, Nicholas, Hamarneh, Ghassan
The AI community usually focuses on "how" to develop AI techniques, but lacks thorough open discussions on "why" we develop AI. Lacking critical reflections on the general visions and purposes of AI may make the community vulnerable to manipulation. In this position paper, we explore the "why" question of AI. We denote answers to the "why" question the imaginations of AI, which depict our general visions, frames, and mindsets for the prospects of AI. We identify that the prevailing vision in the AI community is largely a monoculture that emphasizes objectives such as replacing humans and improving productivity. Our critical examination of this mainstream imagination highlights its underpinning and potentially unjust assumptions. We then call to diversify our collective imaginations of AI, embedding ethical assumptions from the outset in the imaginations of AI. To facilitate the community's pursuit of diverse imaginations, we demonstrate one process for constructing a new imagination of "AI for just work," and showcase its application in the medical image synthesis task to make it more ethical. We hope this work will help the AI community to open dialogues with civil society on the visions and purposes of AI, and inspire more technical works and advocacy in pursuit of diverse and ethical imaginations to restore the value of AI for the public good.
Congratulations to the #AAAI2025 award winners
A number of prestigious AAAI awards were presented during the official opening ceremony of the Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI 2025) on 27 February. Some of the winners will also be giving invited talks as part of the programme. The AAAI Award for Artificial Intelligence for Humanity recognises the positive impacts of artificial intelligence to protect, enhance, and improve human life in meaningful ways with long-lived effects. The winner of this year's award is Stuart J. Russell (University of California, Berkeley, USA). Stuart has been recognised for "work on the conceptual and theoretical foundations of provably beneficial AI and his leadership in creating the field of AI safety".
Toward a Cohesive AI and Simulation Software Ecosystem for Scientific Innovation
Heroux, Michael A., Shende, Sameer, McInnes, Lois Curfman, Gamblin, Todd, Willenbring, James M.
ParaTools, Inc.; Sameer Shende, ParaTools, Inc.; Lois Curfman McInnes, Argonne National Laboratory; Todd Gamblin, Lawrence Livermore National Laboratory; James M. Willenbring, Sandia National Laboratories
In this document, we outline key considerations for the next-generation software stack that will support scientific applications integrating AI and modeling & simulation (ModSim) to provide a unified AI/ModSim software stack. The scientific computing community needs a cohesive AI/ModSim software stack. This AI/ModSim stack must support binary distributions to enable emerging scientific workflows.
A Cohesive Software Stack for AI and Modeling & Simulation
To address future scientific challenges, the next-generation scientific software stack must provide a cohesive portfolio of libraries and tools that facilitate AI and ModSim approaches. As scientific research becomes increasingly interdisciplinary, scientists require both of these toolsets to address complex, data-rich problems in domains such as climate modeling, material discovery, and energy optimization.